Three IQs of AI Systems and their Testing Methods

Authors

Feng Liu

Yong Shi

Ying Liu

Abstract

The rapid development of artificial intelligence has given rise both to the artificial intelligence threat theory and to the problem of how to evaluate the intelligence level of intelligent products. Both call for a quantitative method for evaluating the intelligence level of intelligence systems, including human intelligence. Based on the standard intelligence system and the extended Von Neumann architecture, this paper proposes three evaluation methods for intelligence systems, General IQ, Service IQ and Value IQ, each suited to a different evaluation purpose. The General IQ of intelligence systems addresses the question of whether "artificial intelligence can surpass human intelligence"; it places all intelligence systems on an equal footing and evaluates them in a unified way. The Service IQ and Value IQ address the question of "how intelligent products can better serve humans"; they reflect the intelligence and the required cost of each intelligence system as a product in the process of serving humans.

0. Background

Since AlphaGo defeated the human Go champion Lee Sedol in 2016[1], artificial intelligence has been developing rapidly worldwide. As a result, the artificial intelligence threat theory has also spread widely. At the same time, intelligent products are emerging and flourishing. Can artificial intelligence surpass human intelligence? What level of intelligence do these intelligent products actually reach? Answering these questions requires a quantitative method for evaluating the development level of intelligence systems. Since the introduction of the Turing test in 1950, scientists have done a great deal of work on evaluation systems for the development of artificial intelligence[2].
In 1950, Turing proposed the famous Turing test, which determines whether a computer has intelligence equivalent to that of a human by means of questioning and human judgment. Although it is the most widely used artificial intelligence test method, the Turing test does not measure the development level of artificial intelligence; it only judges whether an intelligence system can match human intelligence. It also depends heavily on the subjective judgments of the judges and testees, which exposes it to considerable interference from human factors, so some people claim that their ideas have passed the Turing test without any strict verification.

On March 24, 2015, the Proceedings of the National Academy of Sciences (PNAS) published a paper proposing a new Turing test method called the "Visual Turing test", designed to evaluate the image cognition ability of computers in greater depth[3]. In 2014, Mark O. Riedl of the Georgia Institute of Technology argued that the essence of intelligence lay in creativity and designed a test called Lovelace 2.0. The scope of Lovelace 2.0 includes the creation of fictional stories, poetry, paintings and music[4].

The existing approaches to quantitative testing of artificial intelligence, including the Turing test, have two problems. Firstly, these test methods do not form a unified model of intelligence, nor do they use such a model as a basis for distinguishing multiple categories of intelligence, which makes it impossible to test different intelligence systems, including humans, in a uniform way. Secondly, these test methods either cannot analyze artificial intelligence quantitatively or can only quantify some aspects of intelligence. What percentage of human intelligence does a given system reach? How does its rate of development compare with that of human intelligence? None of these questions is addressed in the studies above.
In response to these problems, the authors of this paper propose that there are three types of IQ for evaluating the intelligence level of intelligence systems, depending on the purpose of the evaluation: General IQ, Service IQ and Value IQ. The theoretical basis of the three IQs, their detailed definitions and their evaluation methods are elaborated below.

1. Theoretical Basis: Standard Intelligence System and Extended Von Neumann Architecture

Evaluating the intelligence level of an intelligence system, whether a human being or an artificial intelligence system, faces two major challenges. Firstly, artificial intelligence systems do not currently share a unified model; secondly, there is at present no unified model for comparing artificial intelligence systems with humans. In response, the authors' research team drew on the Von Neumann architecture[5], David Wechsler's human intelligence model[6] and the DIKW model from the field of knowledge management[7], and put forward a "standard intelligence model". This model describes the characteristics and attributes of artificial intelligence systems and humans uniformly, and treats an agent as a system with the abilities of knowledge acquisition, mastery, creation and feedback[8] (see Figure 1).

Figure 1 Standard Intelligence Model

Combining this model with the Von Neumann architecture yields an extended Von Neumann architecture (see Figure 2). Compared with the Von Neumann architecture, this model adds an innovation and creation function, which can discover new elements of knowledge and new laws from existing knowledge, store them in the memory for use by the arithmetic and control units, and exchange knowledge with the outside world through the input/output system.
The second addition is an external knowledge database or cloud storage that enables knowledge sharing, whereas the external storage of the Von Neumann architecture serves only a single system.

A. Arithmetic logic unit
B. Control unit
C. Internal memory unit
D. Innovation generator
E. Input device
F. Output device

Figure 2 Expanded Von Neumann Architecture

2. Definitions of Three IQs of Intelligence System

2.1 Proposal of AI General IQ (AI G IQ)

Based on the standard intelligence model, the research team established the AI IQ Test Scale and used it in 2014 and 2016 to conduct AI IQ tests on more than 50 artificial intelligence systems, including Google, Siri, Baidu and Bing, and on human groups aged 6, 12 and 18. According to the test results, the performance of artificial intelligence systems such as Google and Baidu improved greatly over the two years, but still lags behind that of the 6-year-old human group[9] (see Table 1 and Table 2).

Table 1. Ranking of top 13 artificial intelligence IQs for 2014.

Journal: CoRR

Volume: abs/1712.06440

Publication date: 2017
